Wildfires cost the US economy billions of dollars every year, both in direct firefighting costs and in broader economic losses. Then there is the threat of death and health damage to citizens. In addition to these human costs, there are costs to the environment from lost forests and carbon output. Some amount of wildfire is natural and will happen without human intervention; other fires are caused by humans.
The goal of this report is to look at the relationship between some high-level weather and economic factors and human-caused fires. The thought is that, with a better understanding of human-caused fires, we may gain a better understanding of what we can do to mitigate the wildfire damage we inflict upon ourselves.
The data used in this analysis was collected from a variety of publicly available sources. The panel data set built from this collection covers 49 of the 50 states over the years 2018 through 2021. This time period is in part due to the record-keeping methodology of nifc.gov. In years prior to 2018 the number of fires and acres burned were not broken out at the same level of granularity, making it impossible to create a longer data set without changing data sources. The National Interagency Fire Center (NIFC) website was used to collect the fire statistics used in this analysis. The weather data, consisting of precipitation and temperature statistics, was collected from the National Oceanic and Atmospheric Administration (NOAA) website for 49 of the 50 states over the 4-year period. Hawaii data was not available within NOAA’s tool, so Hawaii is not part of this analysis. Land area statistics were gathered from a data set originally from the National Wilderness Institute but pulled from ncrm.org. It should be noted that, given this short time span and high level of aggregation, further research is needed to drill down into the trends found in this report.
The idea behind this analysis was to see whether some high-level variables related to human behavior could be shown to have a link to human-caused wildfire. Part of the analysis needs to take weather into consideration, as it is common knowledge that dry vegetation catches fire more easily than lush green vegetation. A variety of theories have also been offered about the causes of some of the large fires in recent years.
One such theory is that states with more government-managed land have more fires. To analyze this idea, the number of square miles owned by federal or state governments (Gov_sqr_MI_conv) was included in the correlation matrix below (Table 1). We find that it has a very low correlation with both the number of fires caused by humans (0.002) and the number of acres burned by human-caused fires (0.095).
Another human-related variable that has been mentioned in other work as a potential cause of wildfire is population density. This variable (Pop_Dens) was also included in the correlation matrix below (Table 1), and we find that it is essentially uncorrelated with both the number of fires caused by humans (-0.046) and the number of acres burned by human-caused fires (-0.081).
One variable that I have not encountered in fire analysis is GDP. I felt this would be an interesting variable to use, as it might be an indicator of things such as how much money is available to manage land, how developed a region is, and perhaps how well off the people in the region are. Multiple variables were created to measure these ideas. GDP for all industries was collected and used to create GDP per capita, GDP per square mile, and an interaction term, GDP * Land Area. All of these variables were included in Table 1. Both the overall level of GDP and the interaction term were strongly correlated with the number of fires caused by humans (0.741 and 0.873) and moderately correlated with the number of acres burned by human-caused fires (0.534 and 0.575), with the interaction term being the higher of the two in both cases.
Finally, the weather variables were included. The variables collected are the annual precipitation, the precipitation in the center six months of the year (April - September), the annual average temperature, and the average maximum temperature for the center six months of the year (April - September). I was surprised by the results for these variables: they were not as highly correlated with the fire measures as I would have expected. This may be due to the aggregated level at which they were collected. There were some worthwhile correlations among these variables, though none applied to both measures of interest. The maximum temperature average had the strongest link to the number of fires (0.350), while precipitation in the center six months had the strongest link to acres burned (-0.404). All of these measures were included in Table 1.
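The three derived GDP variables described above can be sketched as follows. This is only an illustration in plain Python, not the code used for the report: the `Population` field, the units, and the example values are hypothetical, while the other names follow the column labels in Table 1.

```python
def add_gdp_features(row):
    """Return a copy of `row` with the three derived GDP variables added.

    `row` is one state-year observation. The `Population` key is a
    hypothetical name; the report does not say how population was stored.
    """
    out = dict(row)
    out["GDP_capita"] = row["GDP_All_Indus"] / row["Population"]
    out["GDP_Per_SQR_Mile"] = row["GDP_All_Indus"] / row["Land_area"]
    # Interaction term: overall GDP multiplied by the state's land area.
    out["GDP_X_Land"] = row["GDP_All_Indus"] * row["Land_area"]
    return out


# Made-up values for a single state-year, for illustration only.
example = {
    "State": "Example",
    "Year": 2018,
    "GDP_All_Indus": 200_000.0,
    "Population": 4_000_000,
    "Land_area": 50_000.0,
}
features = add_gdp_features(example)
```

Because the interaction term multiplies two large quantities, its values are many orders of magnitude larger than the other variables, which is why its regression coefficients later in the report are so small.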
Table 1: Correlation matrix of the variables considered.

| | Fires_Human | Acres_Human | Land_area | Pop_Dens | Average_Temp | Max_Temp_avg_C6 | Annual_Percip | Percip_avg_C6 | GDP_capita | GDP_Per_SQR_Mile | GDP_X_Land | C6_Interaction | Gov_sqr_MI_conv | GDP_All_Indus |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Fires_Human | 1 | 0.616 | 0.279 | -0.046 | 0.290 | 0.350 | -0.155 | -0.204 | 0.074 | -0.045 | 0.873 | -0.140 | 0.002 | 0.741 |
| Acres_Human | 0.616 | 1 | 0.192 | -0.081 | 0.099 | 0.164 | -0.314 | -0.404 | 0.137 | -0.063 | 0.575 | -0.365 | 0.095 | 0.534 |
| Land_area | 0.279 | 0.192 | 1 | -0.347 | -0.305 | -0.311 | -0.350 | -0.337 | 0.098 | -0.328 | 0.361 | -0.370 | 0.895 | 0.133 |
| Pop_Dens | -0.046 | -0.081 | -0.347 | 1 | 0.163 | -0.026 | 0.377 | 0.341 | 0.390 | 0.983 | -0.030 | 0.307 | -0.192 | 0.198 |
| Average_Temp | 0.290 | 0.099 | -0.305 | 0.163 | 1 | 0.827 | 0.427 | 0.400 | -0.244 | 0.116 | 0.239 | 0.526 | -0.389 | 0.295 |
| Max_Temp_avg_C6 | 0.350 | 0.164 | -0.311 | -0.026 | 0.827 | 1 | 0.134 | 0.132 | -0.401 | -0.063 | 0.256 | 0.295 | -0.478 | 0.230 |
| Annual_Percip | -0.155 | -0.314 | -0.350 | 0.377 | 0.427 | 0.134 | 1 | 0.897 | -0.169 | 0.335 | -0.201 | 0.887 | -0.232 | -0.021 |
| Percip_avg_C6 | -0.204 | -0.404 | -0.337 | 0.341 | 0.400 | 0.132 | 0.897 | 1 | -0.160 | 0.300 | -0.232 | 0.984 | -0.262 | -0.055 |
| GDP_capita | 0.074 | 0.137 | 0.098 | 0.390 | -0.244 | -0.401 | -0.169 | -0.160 | 1 | 0.493 | 0.225 | -0.230 | 0.135 | 0.393 |
| GDP_Per_SQR_Mile | -0.045 | -0.063 | -0.328 | 0.983 | 0.116 | -0.063 | 0.335 | 0.300 | 0.493 | 1 | -0.013 | 0.260 | -0.175 | 0.227 |
| GDP_X_Land | 0.873 | 0.575 | 0.361 | -0.030 | 0.239 | 0.256 | -0.201 | -0.232 | 0.225 | -0.013 | 1 | -0.189 | 0.056 | 0.849 |
| C6_Interaction | -0.140 | -0.365 | -0.370 | 0.307 | 0.526 | 0.295 | 0.887 | 0.984 | -0.230 | 0.260 | -0.189 | 1 | -0.329 | -0.026 |
| Gov_sqr_MI_conv | 0.002 | 0.095 | 0.895 | -0.192 | -0.389 | -0.478 | -0.232 | -0.262 | 0.135 | -0.175 | 0.056 | -0.329 | 1 | -0.051 |
| GDP_All_Indus | 0.741 | 0.534 | 0.133 | 0.198 | 0.295 | 0.230 | -0.021 | -0.055 | 0.393 | 0.227 | 0.849 | -0.026 | -0.051 | 1 |
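For readers who want to reproduce a matrix like Table 1, the sketch below computes pairwise correlations in plain Python. It assumes the Pearson coefficient, which the report does not state explicitly, and the `toy` values are made up for illustration rather than taken from the data set.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Map each pair of column names to the correlation of their values."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Made-up observations, not the report's data.
toy = {
    "Fires_Human": [120, 340, 560, 910],
    "Max_Temp_avg_C6": [70, 78, 81, 88],
    "Percip_avg_C6": [4.1, 3.2, 2.5, 1.9],
}
matrix = correlation_matrix(toy)
```

The result is symmetric with ones on the diagonal, which is the same layout as Table 1. A column with no variation would cause a division by zero here, so a real pipeline would need to guard against that case.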
After viewing these correlation measures, I created some visualizations to better understand the variables that appear to be of most interest.
To further zero in on which variables are linked to these two metrics, the number of fires caused by humans and the acres burned by human-caused fires, some regressions were fit using the above variables.
| | Volume of Fires |
| --- | --- |
| GDP_X_Land | 0.0000000144*** (0.0000000006) |
| Max_Temp_avg_C6 | 30.91015*** (8.013947) |
| Constant | -1,917.547*** (621.092) |
| Observations | 196 |
| R2 | 0.7795109 |
| Adjusted R2 | 0.777226 |
| Residual Std. Error | 759.9087 (df = 193) |
| F Statistic | 341.1633*** (df = 2; 193) |
| Note: | Standard errors in parentheses. *p<0.1; **p<0.05; ***p<0.01 |

| | Acres Burned |
| --- | --- |
| GDP_X_Land | 0.0000013434*** (0.0000001505) |
| Percip_avg_C6 | -7,041.243*** (1,398.663) |
| Constant | 183,559.1*** (35,058.92) |
| Observations | 196 |
| R2 | 0.4080201 |
| Adjusted R2 | 0.4018856 |
| Residual Std. Error | 192,146.9 (df = 193) |
| F Statistic | 66.51228*** (df = 2; 193) |
| Note: | Standard errors in parentheses. *p<0.1; **p<0.05; ***p<0.01 |
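The form of these models, one outcome explained by an intercept and two regressors, can be sketched as below. This assumes pooled ordinary least squares, which the table layout suggests but the report does not state. The helper names `solve` and `fit_ols` are hypothetical, and the toy data is constructed so that the true coefficients are known.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(y, *regressors):
    """Return ([intercept, slopes...], R squared) from the normal equations."""
    rows = [[1.0, *vals] for vals in zip(*regressors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


# Toy data generated exactly as y = 2 + 3*x1 - 1*x2 (not the report's data).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [2 + 3 * a - b for a, b in zip(x1, x2)]
beta, r_squared = fit_ols(y, x1, x2)
```

On the actual data a statistics library would be the better choice: values of GDP_X_Land are very large relative to the other regressors, which makes the normal equations numerically fragile, and a library would also supply the standard errors and significance tests shown in the tables.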
Looking across these variables, we find three interesting results.
Weather is linked to human-caused fire, but the relationship is not as tight as one might expect.
The weather variables used in this analysis did not have especially high correlations with the target variables (at most 0.350 in absolute value for the number of fires and 0.404 for acres burned), and when put into the regressions only one was significant at a time. When predicting the number of fires, temperature (Max_Temp_avg_C6) was a significant factor. When predicting acres burned, precipitation (Percip_avg_C6) was a significant factor.
Population density and land ownership do not appear to be factors.
These variables had low correlations with the target variables and, when used in regressions, were not significant.
The interaction between GDP and land area is significant.
This variable, which combines two others, is interesting and should be researched further. GDP as a standalone variable did have a high correlation with the target variables, but the correlation was higher when GDP was multiplied by the land area of the state. This may simply be because the more land a state has, the more area there is in which a fire can occur, yet land area on its own did not have a high value in the correlation matrix. The interaction term may instead represent the way in which the land is developed. In predicting both the number of fires and the acres burned, this interaction showed a positive relationship. This could be explored further with other measures of land and economic development, for example the percent of land used for agriculture or the percent of land that is forested.